Architectures for explicit parallelism. Multithreaded processors, small- and large-scale multiprocessor systems. Shared-memory coherence and consistency. Graphics processing units. Effect of architecture on communication latency, bandwidth, and overhead. Latency tolerance techniques. Interconnection networks. The development of programs for parallel computers. Basic concepts such as speedup, load balancing, latency, and system taxonomies. Design of algorithms for idealized models. Programming on parallel systems such as shared- or distributed-memory machines and networks. Grid computing. Performance analysis. Case studies.